Improved Inapproximability For Submodular Maximization 
Abstract



We show that it is Unique Games-hard to approximate the maximum of a submodular function
to within a factor 0.695, and that it is Unique Games-hard to approximate the maximum of a
symmetric submodular function to within a factor 0.739. These results slightly improve previous
results by Feige, Mirrokni and Vondrak (FOCS 2007), who showed that these problems are NP-hard
to approximate to within $3/4 + \epsilon \approx 0.750$ and $5/6 + \epsilon \approx 0.833$, respectively.

1 Introduction 

Given a ground set $U$, consider the problem of finding a set $S \subseteq U$ which maximizes some function
$f : 2^U \to \mathbb{R}^+$ which is submodular, i.e., satisfies

\[ f(S \cup T) + f(S \cap T) \le f(S) + f(T) \]

for every $S, T \subseteq U$. The submodularity property is also known as the property of diminishing returns,
since it is equivalent to requiring that, for every $S \subseteq T \subseteq U$ and $i \in U \setminus T$, it holds that

\[ f(T \cup \{i\}) - f(T) \le f(S \cup \{i\}) - f(S). \]

There has been a lot of attention on various submodular optimization problems throughout the years
(e.g., [10, 7, 2]; see also the first chapter of [14] for a more thorough introduction). Many natural
problems can be cast in this general form; examples include natural graph problems such as maximum
cut, and many types of combinatorial auctions and allocation problems.

A further restriction which is also very natural to study is symmetric submodular functions. These
are functions which satisfy $f(S) = f(\bar{S})$ for every $S \subseteq U$, i.e., a set and its complement always
have the same value. A well-studied example of a symmetric submodular maximization problem is
the problem of finding a maximum cut in a graph.
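The claim that the cut function is both submodular and symmetric can be checked by brute force on a small graph. The sketch below is only an illustration: the graph `EDGES` and all helper names are assumptions made for this example and do not come from the paper.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of the universe as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def cut_value(edges, S):
    """Number of edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))

def is_submodular(f, universe):
    """Check f(S | T) + f(S & T) <= f(S) + f(T) for all pairs of subsets."""
    all_sets = list(subsets(universe))
    return all(f(S | T) + f(S & T) <= f(S) + f(T)
               for S in all_sets for T in all_sets)

def is_symmetric(f, universe):
    """Check that every set has the same value as its complement."""
    full = frozenset(universe)
    return all(f(S) == f(full - S) for S in subsets(full))

# Hypothetical example graph: a 5-cycle with one chord.
U = range(5)
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]

def cut(S):
    return cut_value(EDGES, S)

best_cut = max(cut(S) for S in subsets(U))
```

Brute force is exponential in $|U|$, so this is only meant for toy instances.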

Since it includes familiar NP-hard problems such as maximum cut as a special case, submodular
maximization is in general NP-hard, even in the symmetric case. As a side note, a fundamental and
somewhat surprising result is that submodular minimization has a polynomial time algorithm [4].

To cope with this hardness, there has been much focus on efficiently finding good approximate
solutions. We say that an algorithm is an $\alpha$-approximation algorithm if it is guaranteed to output a set
$S$ for which $f(S) \ge \alpha \cdot f(S_{OPT})$, where $S_{OPT}$ is an optimal set. We also allow randomized algorithms,
in which case we only require that the expectation of $f(S)$ (over the random choices of the algorithm)
is at least $\alpha \cdot f(S_{OPT})$.

In many special cases such as the maximum cut problem, it is very easy to design a constant
factor approximation (in the case of maximum cut it is easy to see that a random cut is a
1/2-approximation). For the general case of an arbitrary submodular function, Feige et al. [2] gave a
$(2/5 - o(1))$-approximation algorithm based on local search, and proved that a uniformly random set
is a 1/2-approximation for the symmetric case. The $(2/5 - o(1))$-approximation has been slightly
improved by Vondrak [13], who achieved a 0.41-approximation algorithm, which is currently the best
algorithm we are aware of.

Furthermore, [2] proved that in the (value) oracle model (where the submodular function to be
maximized is given as a black box), no algorithm can achieve a ratio better than $1/2 + \epsilon$, even in
the symmetric case. However, this result says nothing about the case when one is given an explicit
representation of the submodular function, say, a graph in which one wants to find a maximum cut.
Indeed, in the case of maximum cut there is in fact a 0.878-approximation algorithm, as given by a
famous result of Goemans and Williamson. In the explicit representation model, the best current
hardness results, also given by [2], are that it is NP-hard to approximate the maximum of a submodular
function to within $3/4 + \epsilon$ in the general case and $5/6 + \epsilon$ in the symmetric case.

1.1 Our Results 

In this paper we slightly improve the inapproximability results of [2]. However, as opposed to [2],
we do not obtain NP-hardness but only hardness assuming Khot's Unique Games Conjecture (UGC).
The conjecture asserts that a problem known as Unique Games, or Unique Label Cover, is very
hard to approximate. See e.g. O for more details. While the status of the UGC is quite open, our 
results still imply that obtaining efficient algorithms that beat our bounds would require a fundamental 
breakthrough. 

For general submodular functions we prove the following theorem. 

Theorem 1.1. It is UG-hard to approximate the maximum of a submodular function to within a factor 
0.695. 

In the case of symmetric functions we obtain the following bound. 

Theorem 1.2. For every $\epsilon > 0$ it is UG-hard to approximate the maximum of a symmetric submodular
function to within a factor $709/960 + \epsilon < 0.739$.

These improved inapproximability results still fall short of coming close to the 1/2-barrier in the 
oracle model. Unfortunately, while marginal improvements of our results may be possible, we do not
believe that our approach can come close to a factor 1/2. It remains a challenging and interesting open 
question to determine the exact approximability of explicitly represented submodular functions. 

1.2 Our Approach 

As in [2], the starting point of our approach is hardness of approximation for constraint satisfaction
problems (CSPs), an area which, due to much progress during the last 15 years, is today quite well
understood. Here it is useful to take a slightly different viewpoint. Instead of thinking of the family of
subsets $2^U$ of $U$, we consider the set of binary strings $\{0,1\}^n$ of length $n = |U|$, identified with $2^U$
in the obvious way. These views are of course equivalent and throughout the paper we shift between
them depending on which view is the most convenient.
For a string $x \in \{0,1\}^n$ and a $k$-tuple $C \in [n]^k$ of indices, let $x_C \in \{0,1\}^k$ denote the string of
length $k$ which, in position $j \in [k]$, has the bit $x_{C_j}$. Now, given a function $f : \{0,1\}^k \to \mathbb{R}^+$, we define
the problem Max CSP$^+(f)$ as follows. An instance of Max CSP$^+(f)$ consists of a list of $k$-tuples
of variables $C_1, \ldots, C_m \in [n]^k$. These specify a function $F : \{0,1\}^n \to \mathbb{R}^+$ by

\[ F(x) = \frac{1}{m} \sum_{i=1}^{m} f(x_{C_i}), \]

and the problem is to find an $x \in \{0,1\}^n$ to maximize $F(x)$.

Note that if $f$ is submodular then every instance $F$ of Max CSP$^+(f)$ is submodular and Max CSP$^+(f)$
is a special case of the submodular maximization problem.
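To make the definition concrete, the following minimal sketch evaluates the objective $F(x) = \frac{1}{m}\sum_i f(x_{C_i})$ of a Max CSP$^+(f)$ instance. The constraint function `differ` (which turns the instance into a maximum cut problem), the instance `TRIANGLE`, and the 0-based indexing are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def restrict(x, C):
    """Return x_C: the bits of x at the (0-based) positions listed in C."""
    return tuple(x[j] for j in C)

def csp_value(f, constraints, x):
    """F(x) = (1/m) * sum over constraints C_i of f(x_{C_i})."""
    total = sum(Fraction(f(restrict(x, C))) for C in constraints)
    return total / len(constraints)

def differ(bits):
    """A submodular f on k = 2 bits: 1 if the two bits differ, else 0."""
    return 1 if bits[0] != bits[1] else 0

# Hypothetical instance: one constraint per edge of a triangle.
TRIANGLE = [(0, 1), (1, 2), (0, 2)]

# Brute-force optimum over all assignments in {0,1}^3.
best = max(csp_value(differ, TRIANGLE, x) for x in product((0, 1), repeat=3))
```

Since no literal is ever negated, each constraint is just $f$ applied to a tuple of coordinates, which is what preserves submodularity.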

Next, we use a variation of a result by the author and Mossel [1]. The result of [1] is for CSPs where
one allows negated literals¹, which cannot be allowed in the context of submodular maximization.
However, in Theorem 3.2 we give a simple analogue of the result of [1] for the Max CSP$^+(f)$ setting.

Roughly speaking, the hardness result says the following. Suppose that there is a pairwise independent
distribution $\mu$ such that the expectation of $f$ under $\mu$ is at least $c$, but that the expectation of $f$
under the uniform distribution is at most $s$. Then Max CSP$^+(f)$ is UG-hard to approximate to within
a factor of $s/c$.

The hardness result suggests the following natural approach: take a pairwise independent distribution
$\mu$ with small support, and let $1_\mu : \{0,1\}^k \to \{0,1\}$ be the indicator function of the support
of $\mu$. Then take $f$ to be a "minimum submodular upper bound" to $1_\mu$, by which we mean a submodular
function satisfying $f(x) \ge 1_\mu(x)$ for every $x$ while having small expectation under the uniform
distribution.

To make this plan work, there are a few small technical complications (hidden in the "roughly 
speaking" part of the description of the hardness result above) that we need to overcome, making the 
final construction slightly more complicated. Unfortunately, understanding the "minimum submodular 
upper bound" of the families of indicator functions that we use appears difficult, and to obtain our 
results, we resort to explicitly computing the resulting submodular functions for small k. 

Let us compare our approach with that of [2]. As mentioned above, their starting point is also
hardness of approximation for constraint satisfaction. However, here their approach diverges from
ours: they construct a gadget reduction from the $k$-LIN problem (linear equations mod 2 where each
equation involves only $k$ variables). This gadget introduces two variables $x_i^0$ and $x_i^1$ for every variable
$x_i$ in the $k$-LIN instance, and each equation $x_{i_1} \oplus \ldots \oplus x_{i_k} = b$ is replaced by some submodular
function $f$ on the $2k$ new variables corresponding to the $x_{i_j}$'s. The analysis then has to make sure that
there is always an optimal assignment where for each $i$ exactly one of $x_i^0$ and $x_i^1$ equals 1, which for
the inapproximability of 3/4 becomes quite delicate. In our approach, which we feel is more natural
and direct, we don't run into any such issues.

1.3 Organization 

In Section 2 we set up some more notation that we use throughout the paper and give some additional
background. In Section 3 we describe the hardness result that is our starting point. In Section 4 we
describe in more detail the construction outlined above, and finally, in Section 5 we describe how to
obtain the concrete bounds given in Theorems 1.1 and 1.2.

¹ Where each "constraint" $f(x_{C_i})$ of $F$ is of the more general form $f(x_{C_i} + l_i)$ for some $l_i \in \{0,1\}^k$, where + is
interpreted as addition over GF(2).
2 Notation and Background



Throughout the paper, we identify binary strings in $\{0,1\}^n$ and subsets of $[n]$ in the obvious way.
Analogously to the notation $|S|$ and $\bar{S}$ for the cardinality and complement of a subset $S \subseteq [n]$ we use
$|x|$ and $\bar{x}$ for the Hamming weight and coordinatewise complement of a string $x \in \{0,1\}^n$.



2.1 Submodularity 

Apart from the two definitions in the introduction, a third characterization of submodularity is that a
function $f : 2^X \to \mathbb{R}^+$ is submodular if and only if

\[ f(S) - f(S \cup \{i\}) - f(S \cup \{j\}) + f(S \cup \{i\} \cup \{j\}) \le 0 \qquad (1) \]

for every $S \subseteq X$, and $i, j \in X \setminus S$, $i \ne j$. It is straightforward to check that this condition is equivalent
to the diminishing returns property mentioned in the introduction.
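Condition (1) only involves a set and two additional elements, which makes it convenient for exhaustive checking. The following sketch encodes subsets of $\{0, \ldots, n-1\}$ as bitmasks and searches for a violation; the two test functions (`coverage`, which is submodular, and `squared_size`, which is not) are hypothetical examples and not taken from the paper.

```python
def violates_condition_1(f, n):
    """Return a triple (S, i, j) violating condition (1), or None if there is none.

    Subsets of {0, ..., n-1} are encoded as bitmasks.
    """
    for S in range(1 << n):
        for i in range(n):
            if (S >> i) & 1:
                continue
            for j in range(i + 1, n):
                if (S >> j) & 1:
                    continue
                bi, bj = 1 << i, 1 << j
                if f(S) - f(S | bi) - f(S | bj) + f(S | bi | bj) > 0:
                    return (S, i, j)
    return None

# Hypothetical coverage function: element i covers the items in AREAS[i].
AREAS = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a"}]

def coverage(S):
    covered = set()
    for i, area in enumerate(AREAS):
        if (S >> i) & 1:
            covered |= area
    return len(covered)

def squared_size(S):
    """|S|^2 has increasing returns, so it is not submodular."""
    return bin(S).count("1") ** 2
```

The check visits every $S$ and every pair $i < j$ outside $S$, so it performs $\binom{n}{2} 2^{n-2}$ evaluations of the inequality.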



2.2 Probability 

For $p \in [0,1]$, we use $\{0,1\}^k_{(p)}$ to denote the $k$-dimensional boolean hypercube with the $p$-biased
product distribution, i.e., if $x$ is a sample from $\{0,1\}^k_{(p)}$ then the probability that the $i$'th coordinate
$x_i = 1$ is $p$, independently for each $i \in [k]$.

We abuse notation somewhat by making no distinction between probability distribution functions
$\mu : \{0,1\}^k \to [0,1]$ and the probability space $(\{0,1\}^k, \mu)$ for such $\mu$. Hence we write, e.g., $\mu(x)$ for
the probability of $x \in \{0,1\}^k$ under $\mu$ and $\mathbb{E}_{x \sim \mu}[f(x)]$ for the expectation of a function $f : \{0,1\}^k \to \mathbb{R}$
under $\mu$.

A distribution $\mu$ over $\{0,1\}^k$ is balanced pairwise independent if every two-dimensional marginal
distribution of $\mu$ is the uniform distribution, or formally, if for every $1 \le i < j \le k$ and $b_1, b_2 \in \{0,1\}$,
it holds that

\[ \Pr_{x \sim \mu}[x_i = b_1 \wedge x_j = b_2] = 1/4. \]

Recall that the support $\mathrm{Supp}(\mu)$ of a distribution $\mu$ over $\{0,1\}^k$ is the set of strings with non-zero
probability under $\mu$, i.e., $\mathrm{Supp}(\mu) = \{ x \in \{0,1\}^k : \mu(x) > 0 \}$.
We conclude this section with a lemma that will be useful to us. 

Lemma 2.1. Let $f : \{0,1\}^k \to \mathbb{R}^+$ be a symmetric set function. For $t \in [0,k]$ let $a(t)$ denote the
average of $f$ on strings of weight $t$, $a(t) = \binom{k}{t}^{-1} \sum_{|x|=t} f(x)$. If $a$ is monotonically nondecreasing in
$[0, k/2]$, then the maximum average of $f$ under any $p$-biased distribution is achieved by the uniform
distribution. I.e.,

\[ \max_{p \in [0,1]} \mathbb{E}_{x \sim \{0,1\}^k_{(p)}}[f(x)] = 2^{-k} \sum_{x \in \{0,1\}^k} f(x). \]

This intuitively obvious lemma is probably well known but as we do not know a reference we give 
a proof here. 

Proof. First, we note that without loss of generality we may assume that $f(x)$ is the indicator function
of the event $k/2 - d \le |x| \le k/2 + d$ for some $d \in [0, k/2]$. This is because any $f$ as in the
statement of the lemma can be written as a nonnegative linear combination of such indicator functions
for different $d$, and if the average of each of these indicator functions is maximized for $p = 1/2$ then
so is the average of $f$.

Define $f_1 : \{0,1\}^k \to \{0,1\}$ as the indicator function of the event $|x| \ge k/2 - d$ and
$f_2 : \{0,1\}^k \to \{0,1\}$ as the indicator function of the event $|x| > k/2 + d$, so that $f(x) = f_1(x) - f_2(x)$.
Let $e_j(p)$ denote the average of $f_j$ under the $p$-biased distribution and $e(p) = e_1(p) - e_2(p)$ the average
of $f$ under the $p$-biased distribution.

We will prove that $e'(p) \ge 0$ for $p \le 1/2$ (this is sufficient since we have $e(p) = e(1-p)$ for
symmetry reasons), or in other words that $e_1'(p) \ge e_2'(p)$. Now, $f_1$ and $f_2$ are indicator functions
of monotone events and therefore $e_1'(p)$ and $e_2'(p)$ can be computed by the Margulis-Russo Lemma
[11]:

Lemma 2.2. (Margulis-Russo) Let $f : \{0,1\}^k \to \{0,1\}$ be monotone. For $x \in \{0,1\}^k$ and $i \in [k]$ let
$x \setminus i$ denote $x$ with the $i$'th coordinate set to 0, and let $x \cup i$ denote $x$ with the $i$'th coordinate set to 1.
Then

\[ \frac{d}{dp} \mathbb{E}_{x \sim \{0,1\}^k_{(p)}}[f(x)] = \sum_{i=1}^{k} \Pr_{x \sim \{0,1\}^k_{(p)}}\left[ f(x \setminus i) = 0 \wedge f(x \cup i) = 1 \right]. \]

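Applying Margulis-Russo to the monotone functions $f_1$ and $f_2$, and using that they depend only
on $|x|$, it follows that (assuming without loss of generality that $d$ is such that $k/2 - d$ is an integer):

\[ e_1'(p) = k \cdot \Pr_{x \sim \{0,1\}^{k-1}_{(p)}}\left[ |x| = k/2 - d - 1 \right], \qquad
   e_2'(p) = k \cdot \Pr_{x \sim \{0,1\}^{k-1}_{(p)}}\left[ |x| = k/2 + d \right]. \]

Hence to prove $e_1'(p) \ge e_2'(p)$ we have to prove that, for every $p \le 1/2$,

\[ \Pr_{x \sim \{0,1\}^{k-1}_{(p)}}\left[ |x| = \tfrac{k-1}{2} - \left(d + \tfrac{1}{2}\right) \right]
   \ge \Pr_{x \sim \{0,1\}^{k-1}_{(p)}}\left[ |x| = \tfrac{k-1}{2} + \left(d + \tfrac{1}{2}\right) \right]. \]

This in turn follows immediately from $\Pr_{x \sim \{0,1\}^{k-1}_{(p)}}[|x| = j] = \binom{k-1}{j} p^j (1-p)^{k-1-j}$, since:

\[ \frac{\Pr\left[ |x| = \frac{k-1}{2} - (d + \frac{1}{2}) \right]}{\Pr\left[ |x| = \frac{k-1}{2} + (d + \frac{1}{2}) \right]}
   = \frac{p^{\frac{k-1}{2} - (d + \frac{1}{2})} (1-p)^{\frac{k-1}{2} + (d + \frac{1}{2})}}{p^{\frac{k-1}{2} + (d + \frac{1}{2})} (1-p)^{\frac{k-1}{2} - (d + \frac{1}{2})}}
   = \left( \frac{1-p}{p} \right)^{2d+1} \ge 1. \]

□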
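The key step of the proof, namely that the $p$-biased weight of a band around $k/2$ is largest at $p = 1/2$, is easy to observe numerically. The sketch below evaluates $e(p)$ for one such indicator function on a grid of biases; the parameters `K = 10` and `D = 2` and the grid resolution are arbitrary choices made for this illustration.

```python
from math import comb

def band_average(k, d, p):
    """p-biased probability that k/2 - d <= |x| <= k/2 + d (k even, d an integer)."""
    lo, hi = k // 2 - d, k // 2 + d
    return sum(comb(k, t) * p**t * (1 - p)**(k - t) for t in range(lo, hi + 1))

K, D = 10, 2

# e(p) on the grid p = 0.00, 0.01, ..., 1.00.
values = [band_average(K, D, i / 100) for i in range(101)]
best_index = max(range(101), key=lambda i: values[i])
```

On this grid the values increase up to $p = 1/2$ and then decrease symmetrically, in line with $e'(p) \ge 0$ for $p \le 1/2$ and $e(p) = e(1-p)$.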



3 Hardness from Pairwise Independence 

In this section we state formally the variation of the hardness result of [1] that we use. We first define 
the parameters which control the inapproximability ratio that we obtain. 

Definition 3.1. Let $f : \{0,1\}^k \to \mathbb{R}^+$ be a submodular function.

We define the completeness $c_\mu(f)$ of $f$ with respect to a distribution $\mu$ over $\{0,1\}^k$ by the expected
value of $f$ under $\mu$, i.e.,

\[ c_\mu(f) := \mathbb{E}_{x \sim \mu}[f(x)]. \]

We define the soundness $s_p(f)$ of $f$ with respect to bias $p$ by the expected value of $f$ under the
$p$-biased distribution, i.e.,

\[ s_p(f) := \mathbb{E}_{x \sim \{0,1\}^k_{(p)}}[f(x)]. \]

Finally, we define the soundness $s(f)$ of $f$ by its maximum soundness with respect to any bias,
i.e.,

\[ s(f) := \max_{p \in [0,1]} s_p(f). \]

We can now state the hardness result. 

Theorem 3.2. Let $\mu$ be a balanced pairwise independent distribution over $\{0,1\}^k$. Then for every
objective function $f : \{0,1\}^k \to \mathbb{R}^+$ and $\epsilon > 0$, given a Max CSP$^+(f)$ instance $F : \{0,1\}^n \to \mathbb{R}^+$,
it is UG-hard to distinguish between the cases:

Yes: There is an $S \subseteq X$ such that $F(S) \ge c_\mu(f) - \epsilon$.

No: For every $S \subseteq X$ it holds that $F(S) \le s(f) + \epsilon$.

The proof of Theorem 3.2 follows the proof of [1] almost exactly. For the sake of completeness,
we give a bare bones proof in Appendix A.

Consequently, for any submodular function $f$ and pairwise independent distribution $\mu$ with all
marginals equal, it is UG-hard to approximate Max CSP$^+(f)$ to within a factor $s(f)/c_\mu(f) + \epsilon$ for
every $\epsilon > 0$. Note also that the No case is the best possible: there is a trivial algorithm which finds a
set such that $F(S) \ge s(f)$ for every $F$, by simply letting each input be 1 with probability $p$ for the $p$
that maximizes $s_p(f)$.

As a somewhat technical remark, we mention that Theorem 3.2 still holds if $\mu$ is not required to be
balanced: it suffices that all the one-dimensional marginal probabilities $\Pr_{x \sim \mu}[x_i = 1]$ are identical,
not necessarily equal to 1/2 as in the balanced case. We state the somewhat simpler form since that
is sufficient to obtain our results for submodular functions and since that makes it more similar to the
result of [1], which requires the distribution $\mu$ to be balanced.

Let us then briefly discuss the difference between Theorem 3.2 and the main result of [1]. First,
the result of [1] only applies in the more general setting when one allows negated literals, which is
why it cannot be used to obtain inapproximability for submodular functions. On the other hand,
this more general setting allows for a stronger conclusion: in the No case, [1] achieves a soundness
of $s_{1/2}(f) + \epsilon$, which in general can be much smaller than $s(f)$. As an example, consider the case
when $f : \{0,1\}^3 \to \{0,1\}$ is the logical OR function on 3 bits. In this case the Max CSP$^+(f)$
problem is of course trivial (the all-ones assignment satisfies all constraints) and $s(f) = 1$, whereas
$s_{1/2}(f) = 7/8$. Letting $\mu$ be the uniform distribution on strings of odd parity (it is readily verified
that this is a balanced pairwise independent distribution) one gets $c_\mu(f) = 1$, showing that the Max
3-SAT problem is hard to approximate to within $7/8 + \epsilon$.
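The numbers in the OR example can be reproduced directly from Definition 3.1. The sketch below uses exact rational arithmetic; the function names are assumptions introduced for this illustration.

```python
from fractions import Fraction
from itertools import product

def is_balanced_pairwise_independent(mu, k):
    """mu maps strings (tuples of bits) to probabilities; check all 2-dim marginals."""
    for i in range(k):
        for j in range(i + 1, k):
            for b1, b2 in product((0, 1), repeat=2):
                mass = sum(p for x, p in mu.items() if x[i] == b1 and x[j] == b2)
                if mass != Fraction(1, 4):
                    return False
    return True

def completeness(f, mu):
    """c_mu(f): the expectation of f under mu."""
    return sum(p * f(x) for x, p in mu.items())

def soundness_p(f, k, p):
    """s_p(f): the expectation of f under the p-biased product distribution."""
    return sum(f(x) * p**sum(x) * (1 - p)**(k - sum(x))
               for x in product((0, 1), repeat=k))

def or3(x):
    """Logical OR on 3 bits."""
    return 1 if any(x) else 0

# Uniform distribution on the 3-bit strings of odd parity.
odd_strings = [x for x in product((0, 1), repeat=3) if sum(x) % 2 == 1]
mu = {x: Fraction(1, len(odd_strings)) for x in odd_strings}
```

The gap between $s_{1/2}(f) = 7/8$ and $s(f) = 1$ is exactly the price of not being allowed to negate literals.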

4 The Construction 

In this section we make formal the construction outlined in Section 1.2.

Theorem 3.2 suggests the following natural approach: pick a pairwise independent distribution $\mu$
over $\{0,1\}^k$ and let $1_\mu : \{0,1\}^k \to \{0,1\}$ be the indicator function of the support of $\mu$. Then take $f$ to
be a "minimum submodular upper bound" to $1_\mu$, by which we mean a submodular function satisfying
$f(x) \ge 1_\mu(x)$ for every $x$ while having $s(f)$ as small as possible (whereas $c_\mu(f)$ is clearly at least 1).
Note that the smaller the support of $\mu$, the less constrained $f$ is, meaning that there should be more
room to make $s(f)$ small.

To this end, let us make the following definition. 
Definition 4.1. For a subset $C \subseteq \{0,1\}^k$, we denote by $\mathrm{SM}(C)$ the optimum function $f : \{0,1\}^k \to \mathbb{R}^+$
of the following program²:

    Minimize    $s(f)$
    Subject to  $f(x) \ge 1$ for every $x \in C$
                $f$ is submodular

In addition, we write $\mathrm{SM}_p(C)$ for the optimal $f$ when the objective to be minimized is changed to
$s_p(f)$ instead of $s(f)$. Analogously, we define $\mathrm{SM}^{sym}(C)$ and $\mathrm{SM}^{sym}_p(C)$ as the optimal $f$ with the
additional restriction that $f$ is symmetric.

While the objective function $s(f)$ is not linear (or even convex), it turns out that for the $C$'s that we
are interested in, $\mathrm{SM}(C)$ is actually quite well approximated by $\mathrm{SM}_{1/2}(C)$, i.e., we simply minimize
$\sum_x f(x)$ (in fact, we even believe that for our $C$'s $\mathrm{SM}_{1/2}(C)$ gives the exact optimum for $\mathrm{SM}(C)$,
though we have not attempted to prove it). The advantage of considering $\mathrm{SM}_{1/2}(C)$ is of course that it
is given by a linear program, which gives us a reasonably efficient way of finding it. Armed with this
definition, let us now describe the constructions we use.
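For concreteness, the linear program behind $\mathrm{SM}_{1/2}(C)$ can be written out with one variable $f(x)$ per string $x \in \{0,1\}^k$. The formulation below is a sketch that takes condition (1) from Section 2.1 as the submodularity constraints; the explicit nonnegativity constraints are an assumption reflecting the range $\mathbb{R}^+$.

```latex
\begin{aligned}
\text{minimize}   \quad & 2^{-k} \sum_{x \in \{0,1\}^k} f(x) \\
\text{subject to} \quad & f(x) \ge 1
    && \text{for every } x \in C, \\
& f(S) - f(S \cup \{i\}) - f(S \cup \{j\}) + f(S \cup \{i,j\}) \le 0
    && \text{for every } S \subseteq [k] \text{ and } i \ne j \in [k] \setminus S, \\
& f(x) \ge 0
    && \text{for every } x \in \{0,1\}^k.
\end{aligned}
```

This program has $2^k$ variables and $\binom{k}{2} 2^{k-2}$ submodularity constraints, which is why it is only practical for small $k$. For the symmetric variant one would add the equalities $f(x) = f(\bar{x})$.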

4.1 The Asymmetric Case 

The family of pairwise independent distributions $\mu$ that we consider is a standard construction based
on the Hadamard code. Fix a parameter $l > 0$ and let $k = 2^l - 1$. We identify the set of coordinates
$[k]$ with the set of non-empty subsets of $[l]$, in some arbitrary way. A string $x$ from the distribution $\mu$
is sampled as follows: pick a uniformly random string $y \in \{0,1\}^l$ and define, for each $\emptyset \ne T \subseteq [l]$,
the coordinate $x_T = \bigoplus_{j \in T} y_j$.
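The construction is short enough to spell out in code. The following sketch enumerates the support of $\mu$ for a given $l$ and checks balanced pairwise independence by counting; the function names and the particular ordering of the coordinates are assumptions made for this example.

```python
from itertools import combinations, product

def hadamard_distribution(l):
    """Return (coords, support): coords are the non-empty subsets T of {0..l-1},
    and support lists the string (x_T)_T for every y in {0,1}^l."""
    coords = [T for r in range(1, l + 1) for T in combinations(range(l), r)]
    support = [tuple(sum(y[j] for j in T) % 2 for T in coords)
               for y in product((0, 1), repeat=l)]
    return coords, support

def is_balanced_pairwise_independent(support):
    """support is a list of equally likely strings (the uniform distribution on it)."""
    k, n = len(support[0]), len(support)
    for i, j in combinations(range(k), 2):
        for b1, b2 in product((0, 1), repeat=2):
            hits = sum(1 for x in support if (x[i], x[j]) == (b1, b2))
            if 4 * hits != n:
                return False
    return True

coords, support = hadamard_distribution(3)
weights = sorted(sum(x) for x in support)
```

For $l = 3$ this gives $k = 7$ coordinates and 8 support strings: the all-zeros string and 7 strings of weight $(k+1)/2 = 4$.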

This construction already has an issue: since the all-zeros string $\mathbf{0}$ is in the support of the distribution,
any submodular upper bound to $1_\mu$ must have $f(\mathbf{0}) \ge 1$, implying that $s_0(f) = f(\mathbf{0}) \ge 1$. To fix this, we
simply ignore $\mathbf{0}$ when constructing $f$. Formally, let $C_l = \mathrm{Supp}(\mu) \setminus \{\mathbf{0}\} \subseteq \{0,1\}^k$ be the $2^l - 1$ strings
in the support of $\mu$ except $\mathbf{0}$. Now we would like to take our submodular function $f$ to be $\mathrm{SM}(C_l)$, but
we instead take it to be $\mathrm{SM}_{1/2}(C_l)$, as this function is much more easily computed.

Definition 4.2. For a parameter $l > 0$, let $k = 2^l - 1$ and take $C_l \subseteq \{0,1\}^k$ as above. We define
$f_l = \mathrm{SM}_{1/2}(C_l)$.

Note that using only $C_l$ instead of the entire support costs us a little in that the completeness is now
reduced from 1 to $c_\mu(f_l) \ge 1 - 2^{-l}$, but one can hope (and it indeed turns out that this is the case) that
this loss is compensated by a greater improvement in soundness.

Also, we stress that $s(f_l)$ is typically not given by the average $s_{1/2}(f_l)$ (which is the quantity
actually minimized by $f_l$). Indeed, the points in $C_l$ all have Hamming weight $(k+1)/2$ and this is also
where $f_l$ is typically the largest. This causes $s(f_l)$ to be achieved by the $p$-biased distribution for some
$p$ slightly larger than 1/2.

An obvious question to ask is whether using $\mathrm{SM}(C_l)$ would give a better result than using $\mathrm{SM}_{1/2}(C_l)$.
For the values of $l$ that we have been able to handle, it appears that the answer to this question is
negative: computing $\mathrm{SM}_p(C_l)$ for a $p$ that approximately maximizes $s_p(f_l)$ gives $f_l$, indicating that we in
fact have $f_l = \mathrm{SM}(C_l)$.

² In the case when the optimum is not unique, we choose an arbitrary optimal $f$ as $\mathrm{SM}(C)$.






4.2 Symmetric Functions 

One way of constructing symmetric functions would be to use the exact same construction as above
but taking $\mathrm{SM}^{sym}(C_l)$ rather than $\mathrm{SM}(C_l)$. However, that is somewhat wasteful, and we achieve better
results by also taking symmetry into account when constructing the family of strings $C$.

Thus, we alter the above construction as follows: rather than identifying the coordinates with all
non-empty subsets of $[l]$, we identify them with all subsets of $[l]$ of odd cardinality. In other words, we
take $k = 2^{l-1}$ and associate $[k]$ with all $T \subseteq [l]$ such that $|T|$ is odd. The resulting distribution $\mu$ is
symmetric in the sense that if $x$ is in the support then so is $\bar{x}$.

In this case, both the all-zeros string $\mathbf{0}$ and the all-ones string $\mathbf{1}$ are in the support, which is not
acceptable for the same reason as above. Hence, we construct a submodular function by taking
$C_l^{sym} = \mathrm{Supp}(\mu) \setminus \{\mathbf{0}, \mathbf{1}\}$ (note that $|C_l^{sym}| = 2^l - 2$).
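The symmetric variant differs from the asymmetric one only in which subsets index the coordinates. The sketch below builds the support for $l = 4$ and checks the properties used in the text (closure under complement, and codewords of weight $k/2$); all names are assumptions for this illustration.

```python
from itertools import combinations, product

def symmetric_distribution(l):
    """Coordinates are the odd-cardinality subsets T of {0..l-1};
    the support lists (x_T)_T for every y in {0,1}^l."""
    coords = [T for r in range(1, l + 1, 2) for T in combinations(range(l), r)]
    support = [tuple(sum(y[j] for j in T) % 2 for T in coords)
               for y in product((0, 1), repeat=l)]
    return coords, support

def complement(x):
    return tuple(1 - b for b in x)

coords, support = symmetric_distribution(4)
k = len(coords)

# C_l^sym: the support without the all-zeros and all-ones strings.
c_sym = [x for x in support if 0 < sum(x) < k]
closed_under_complement = all(complement(x) in set(support) for x in support)
```

Flipping every bit of $y$ flips every coordinate $x_T$ because $|T|$ is odd, which is where the closure under complement comes from.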

Definition 4.3. For a parameter $l > 0$, let $k = 2^{l-1}$ and take $C_l^{sym} \subseteq \{0,1\}^k$ as above. We define
$f_l^{sym} = \mathrm{SM}^{sym}_{1/2}(C_l^{sym})$.

In this case, since we removed 2 out of the $2^l$ points of the support of $\mu$ to construct $C_l^{sym}$, we have
that $c_\mu(f_l^{sym}) \ge 1 - 2^{1-l}$.

A salient feature of $f_l^{sym}$ is that all strings of $C_l^{sym}$ have Hamming weight exactly $k/2$. By
Lemma 2.1 this causes $s_p(f_l^{sym})$ to be maximized by $p = 1/2$ (the monotonicity of the function $a$
in Lemma 2.1 is not immediately clear). This means that in the symmetric case, using $\mathrm{SM}^{sym}_{1/2}(C_l^{sym})$
rather than $\mathrm{SM}^{sym}(C_l^{sym})$ is provably without loss of generality.

5 Concrete Bounds 

Unfortunately, understanding the behaviour of the two families of functions $f_l$ and $f_l^{sym}$ (or even just
their soundnesses) for large $l$ appears difficult. There seem to be two conflicting forces at work: on
the one hand, $C_l$ only has $2^l - 1 = k$ points so even though $f_l$ is forced to be large on these there may
still be plenty of room to make it small elsewhere. But on the other hand, since $C_l$ is a good code the
elements of $C_l$ are very spread out (their pairwise Hamming distances are roughly $k/2$), which together
with the submodularity condition appears to force $f_l$ to be large.

In this section we study $f_l$ for small $l$, obtaining our hardness results. As discussed towards the
end of the section, there are indications that the inapproximability given by $f_l$ actually becomes worse
for large $l$ and that our results are the best possible for this family of functions, but we do not yet know
whether these indications are correct.

5.1 Symmetric Functions 

We start with the symmetric functions, as these are somewhat nicer than the asymmetric ones in that
their symmetry turns out to cause $s(f_l^{sym})$ to be achieved by $p = 1/2$, i.e., $s(f_l^{sym})$ simply equals the
average of $f_l^{sym}$. Table 1 gives a summary of the completeness, soundness, and inapproximability
obtained by $f_l^{sym}$ for $l \in \{3, 4, 5\}$. We now describe these functions in more detail.

    l    c        s           Inapproximability s/c
    3    3/4      5/8         5/6 < 0.8334
    4    7/8      43/64       43/56 < 0.7679
    5    15/16    709/1024    709/960 < 0.7386

Table 1: Behaviour of $f_l^{sym}$ for small $l$.

As a warmup, let us first describe the quite simple function $f_4^{sym} : 2^{[8]} \to [0,1]$ (we leave the even
easier function $f_3^{sym}$ to the interested reader). Its definition is as follows:

\[ f_4^{sym}(S) = \begin{cases}
     f_4^{sym}(\bar{S}) & \text{if } |S| > 4 \\
     |S|/4 & \text{if } |S| < 4 \\
     1 & \text{if } |S| = 4 \text{ and } S \text{ is in } C_4^{sym} \\
     3/4 & \text{otherwise}
   \end{cases} \]

That $f_4^{sym}(S)$ is submodular is easily verified. It is also easy to check that Lemma 2.1 applies and
therefore we have that $s(f_4^{sym}) = s_{1/2}(f_4^{sym})$, which is straightforward to compute (note that $|C_4^{sym}| = 14$):

\[ s(f_4^{sym}) = 2^{-8} \left( 2 \left( \binom{8}{1} \cdot \frac{1}{4} + \binom{8}{2} \cdot \frac{2}{4} + \binom{8}{3} \cdot \frac{3}{4} \right)
   + 14 \cdot 1 + \left( \binom{8}{4} - 14 \right) \cdot \frac{3}{4} \right) = \frac{43}{64}. \]

Let us then move on to the next function $f_5^{sym} : 2^{[16]} \to [0,1]$, giving an inapproximability of
0.7386. It turns out that one can take $f_5^{sym}(S)$ to be a function of two simple properties of $S$, namely
its cardinality $|S|$, and the distance from $S$ to $C_5^{sym}$. Specifically, for $|S| \le 8$ let us define the number
of errors $e(S)$ as the minimum number of elements that must be removed from $S$ to get a subset of
some set in $C_5^{sym}$. Formally

\[ e(S) = \min_{C \in C_5^{sym}} |S \setminus C|, \]

or equivalently, $d(S, C_5^{sym}) = 8 - |S| + 2e(S)$, where $d(S, C_5^{sym})$ is the Hamming distance from the
binary string corresponding to $S$ to the nearest element in $C_5^{sym}$. Table 2 gives the values of $f_5^{sym}$
for all $|S| \le 8$, and for $|S| > 8$ the value of $f_5^{sym}(S)$ is given by $f_5^{sym}(\bar{S})$. Note that, for sets with
$e(S) = 0$, i.e., no errors, $f_5^{sym}(S)$ is simply $|S|/8$, which is what one would expect. However, for sets
with errors, $f_5^{sym}(S)$ has a more complicated behaviour and it is far from clear how this generalizes to
larger $l$.

    e(S) \ |S|    0    1      2      3      4      5        6        7        8
    0             0    1/8    2/8    3/8    4/8    5/8      6/8      7/8      1
    1                                              19/32    22/32    24/32    26/32
    2                                                       20/32    23/32    24/32

Table 2: Description of $f_5^{sym}(S)$ as a function of $|S|$ and $e(S)$ for $|S| \le 8$.

Verifying that $f_5^{sym}$ is indeed submodular is not as straightforward as with $f_4^{sym}$. We have not
attempted to construct a shorter proof of this than simply checking condition (1) for every $S$, $i$ and $j$, a
task which is of course best suited for a computer program (which is straightforward to write and runs
in a few seconds).

    l    c        s(f_l)      Inapproximability s/c
    3    7/8      < 0.6275    < 0.7172
    4    15/16    < 0.6508    < 0.6942

Table 3: Behaviour of $f_l$ for small $l$.
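The kind of computer check referred to above is easy to carry out for the warmup function $f_4^{sym}$. The sketch below rebuilds $C_4^{sym}$ from the construction of Section 4.2, tests condition (1) exhaustively, and recomputes the $l = 4$ row of Table 1 with exact rationals. Encoding sets as bitmasks and all identifier names are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations, product

L = 4
# Coordinates: odd-cardinality subsets of {0, ..., L-1}, so K = 2^(L-1) = 8.
COORDS = [T for r in range(1, L + 1, 2) for T in combinations(range(L), r)]
K = len(COORDS)
FULL = (1 << K) - 1

def codeword(y):
    """Bitmask of the string (x_T)_T with x_T = XOR of y_j over j in T."""
    return sum(1 << pos for pos, T in enumerate(COORDS)
               if sum(y[j] for j in T) % 2)

SUPPORT = [codeword(y) for y in product((0, 1), repeat=L)]
C_SYM = {c for c in SUPPORT if c not in (0, FULL)}

def weight(S):
    return bin(S).count("1")

def f4_sym(S):
    w = weight(S)
    if w > 4:
        return f4_sym(FULL ^ S)          # value of the complement
    if w < 4:
        return Fraction(w, 4)
    return Fraction(1) if S in C_SYM else Fraction(3, 4)

def is_submodular(f, k):
    """Exhaustively check condition (1) for every S and every pair i < j outside S."""
    for S in range(1 << k):
        free = [i for i in range(k) if not (S >> i) & 1]
        for a, i in enumerate(free):
            for j in free[a + 1:]:
                bi, bj = 1 << i, 1 << j
                if f(S) - f(S | bi) - f(S | bj) + f(S | bi | bj) > 0:
                    return False
    return True

average = sum(f4_sym(S) for S in range(1 << K)) / (1 << K)       # s_{1/2}
completeness = sum(f4_sym(c) for c in SUPPORT) / len(SUPPORT)    # c_mu
ratio = average / completeness
```

The same exhaustive loop applies to $f_5^{sym}$ on 16 elements, at the cost of $\binom{16}{2} 2^{14}$ inequality checks.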

A computer program is also the best way to compute the soundness $s(f_5^{sym})$. It is almost obvious
from inspection of Table 2 that $f_5^{sym}$ satisfies the monotonicity condition of Lemma 2.1 (the only
possible source of failure is that the table only implies that the average of $f_5^{sym}$ on sets of size 6
is between 20/32 and 24/32, and that the average on sets of size 7 is between 23/32 and 28/32).
It turns out that the conditions of Lemma 2.1 are indeed satisfied and that the average of $f_5^{sym}$ is
$s_{1/2}(f_5^{sym}) = 709/1024$.

Concluding this discussion on $f_l^{sym}$, it is tempting to speculate on its behaviour for larger $l$. We
have made a computation of $f_6^{sym} : 2^{[32]} \to [0,1]$, under the assumption that $f_6^{sym}(S)$ only depends on
$|S|$ and the multiset of distances to every point of $C_6^{sym}$. Under this assumption, our computations
indicate that $s(f_6^{sym}) \approx 0.7031$, giving an inapproximability of $s(f_6^{sym})/(31/32) \approx 0.7258$,
improving upon $f_5^{sym}$. However, as these computations took a few days they are quite cumbersome to
verify (and we have not even made a careful verification of them ourselves) and therefore we do not
claim this stronger hardness as a theorem.



5.2 Asymmetric Functions 

We now return our focus to the asymmetric case. Table 3 describes the hardness ratios obtained from
$f_l$ for the cases $l = 3$ and $l = 4$.

We begin with the description of the function $f_3 : 2^{[7]} \to [0,1]$. Similarly to the definition of $e(S)$
used in the description of $f_5^{sym}$, let us say that $S \subseteq [7]$ has no errors if it is a subset or a superset of
some $C \in C_3$. In other words, if $|S| \le 4$ it has no errors if it can be transformed to a set in $C_3$ by
adding some elements, and if $|S| \ge 4$ it has no errors if it can be transformed to a codeword by
removing some elements. The function $f_3$ is as follows:

\[ f_3(S) = \begin{cases}
     |S|/4 & \text{if } |S| \le 4 \text{ and } S \text{ has no errors} \\
     (7 - |S|)/3 & \text{if } |S| \ge 4 \text{ and } S \text{ has no errors} \\
     11/24 & \text{if } |S| = 3 \text{ and } S \text{ has errors} \\
     17/24 & \text{if } |S| = 4 \text{ and } S \text{ has errors}
   \end{cases} \]

As with $f_5^{sym}$, it is not completely obvious that $f_3$ satisfies the submodularity condition and there are
a few cases to verify, best left to a computer program.

The average of $f_3$ is $637/1024 \approx 0.622$. However, since $f_3$ takes on its largest values at sets of
size $(k+1)/2 = 4$, the $p$-biased average is larger than this for some $p > 1/2$. It turns out that $s(f_3)$
is obtained by the $p$-biased distribution for $p \approx 0.542404$, giving $s(f_3) \approx 0.627434 < 0.6275$.
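These figures for $f_3$ can be reproduced with a short program. The sketch below rebuilds $C_3$ from the Hadamard construction, implements the case definition of $f_3$, checks condition (1), and locates the maximizing bias by a simple grid search. The grid resolution of $10^{-4}$ and all identifier names are assumptions of this illustration, so the maximizer is only found approximately.

```python
from fractions import Fraction
from itertools import combinations, product

L = 3
# Coordinates: non-empty subsets of {0, ..., L-1}, so K = 2^L - 1 = 7.
COORDS = [T for r in range(1, L + 1) for T in combinations(range(L), r)]
K = len(COORDS)

def codeword(y):
    return sum(1 << pos for pos, T in enumerate(COORDS)
               if sum(y[j] for j in T) % 2)

# C_3: the seven non-zero codewords, encoded as bitmasks.
C3 = [codeword(y) for y in product((0, 1), repeat=L) if any(y)]

def weight(S):
    return bin(S).count("1")

def has_no_errors(S):
    """S is a subset or a superset of some codeword."""
    return any((S & C) == S or (S & C) == C for C in C3)

def f3(S):
    w = weight(S)
    if has_no_errors(S):
        return Fraction(w, 4) if w <= 4 else Fraction(7 - w, 3)
    return Fraction(11, 24) if w == 3 else Fraction(17, 24)

def is_submodular(f, k):
    """Exhaustively check condition (1)."""
    for S in range(1 << k):
        free = [i for i in range(k) if not (S >> i) & 1]
        for a, i in enumerate(free):
            for j in free[a + 1:]:
                bi, bj = 1 << i, 1 << j
                if f(S) - f(S | bi) - f(S | bj) + f(S | bi | bj) > 0:
                    return False
    return True

# Total value of f3 on each weight level, used for the p-biased average.
LEVEL_SUMS = [Fraction(0)] * (K + 1)
for S in range(1 << K):
    LEVEL_SUMS[weight(S)] += f3(S)

def soundness(p):
    """s_p(f3) as a float."""
    return sum(float(LEVEL_SUMS[w]) * p**w * (1 - p)**(K - w) for w in range(K + 1))

average = sum(LEVEL_SUMS) / (1 << K)                       # s_{1/2}(f3), exact
completeness = sum(f3(C) for C in C3) / Fraction(2**L)     # the zero string contributes 0
best_p = max((i / 10000 for i in range(10001)), key=soundness)
best_s = soundness(best_p)
```

Dividing `best_s` by the completeness $7/8$ gives the ratio reported for $l = 3$ in Table 3.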

We are left with the description of $f_4 : 2^{[15]} \to [0,1]$, which is also the most complicated function
yet. One might hope that $f_4$ shares the simple structure of the previous functions, i.e., that it depends only
on $|S|$ and the distance of $S$ to the nearest $C \in C_4$. However, the best function under this assumption
turns out to give a worse result than $f_3$. Instead, $f_4$ depends on $|S|$ and the multiset of distances to all
elements of $C_4$.

To describe $f_4$, define for $S \subseteq [15]$ the multiset $\mathcal{D}(S)$ as the multiset of distances to all the 15
strings in $C_4$. For instance, for $S = \emptyset$, $\mathcal{D}(S)$ consists of the number 8 repeated 15 times, reflecting the
fact that all strings of $C_4$ have weight 8, and for $S \in C_4$ we have that $\mathcal{D}(S)$ consists of the number 8
repeated 14 times, together with a single 0, because the distance between any pair of strings in $C_4$ is 8.

Table 4 describes the behaviour of $f_4(S)$ as a function of $|S|$ and $\mathcal{D}(S)$. In the table $\mathcal{D}(S)$ is
described by a string of the form $d_1^{m_1} d_2^{m_2} \ldots$, with $d_1 < d_2 < \ldots$ and $\sum_i m_i = 15$, indicating that $m_1$
strings of $C_4$ are at distance $d_1$ from $S$, that $m_2$ strings are at distance $d_2$, and so on. Thus, for $S = \emptyset$
the description of $\mathcal{D}(S)$ is "$8^{15}$", and for $S \in C_4$ the description of $\mathcal{D}(S)$ is "$0^1 8^{14}$".

The #S column of Table 4 gives the total number of $S \subseteq [15]$ having this particular value of
$(|S|, \mathcal{D}(S))$, and the last column gives the actual value of $f_4$, multiplied by 448 to make all values
integers.

Again, checking that $f_4$ is submodular is a tedious task best suited for a computer. The average of
$f_4$ is $9519345/(448 \cdot 2^{15}) \approx 0.6485$, but, as with $f_3$, $s(f_4)$ is somewhat larger than this. It turns out
that the $p$ maximizing $s_p(f_4)$ is roughly $p \approx 0.526613$, and that $s(f_4) \approx 0.650754 < 0.6508$.

Finally, we mention that as in the symmetric case, we have made a computation of the next function,
$f_5$, again under the assumption that it depends only on the multiset of distances to the codewords.
Under this assumption it turns out that $s_{1/2}(f_5) \approx 0.6743$, meaning that the inapproximability
obtained cannot be better than $s_{1/2}(f_5)/(31/32) \approx 0.6961$, which is worse than the inapproximability
obtained from $f_4$.
A Proof of Theorem 3.2

To prove Theorem 3.2 we only give a dictatorship test with certain properties. The method of
translating such a test into a hardness result under the UGC, going back to the results of Khot et al. [6] for
Max Cut, is by now quite standard (see e.g. [1]).

A.1 Background: Polynomials, Quasirandomness and Correlation Bounds

To set up the dictatorship test we need to mention some background material.

A function $F : \{0,1\}^n \to \mathbb{R}$ is said to be a dictator if $F(x) = x_i$ for some $i \in [n]$, i.e., $F$ simply
returns the $i$'th coordinate.

Now, any function $F : \{0,1\}^n \to \mathbb{R}$ can be written uniquely as a multilinear polynomial
$F(x) = \sum_{S \subseteq [n]} c_S x^S$ for some set of coefficients $c_S$, where $x^S := \prod_{i \in S} x_i$. With this view there is an obvious
extension of the domain of $F$ to $[0,1]^n$ (or even $\mathbb{R}^n$, but we shall only be interested in $[0,1]^n$).

We say that such a polynomial is $(d, \tau)$-quasirandom if for every $i \in [n]$ it holds that

\[ \sum_{\substack{i \in S \subseteq [n] \\ |S| \le d}} c_S^2 \le \tau. \]

Note that a dictator is in some sense the extreme opposite of a $(d, \tau)$-quasirandom function as a dictator
is not even $(1, \tau)$-quasirandom for $\tau < 1$.

The main tool to obtain the soundness is the following "noise correlation bound" result of Mossel
[9] (Theorem 6.6 and Lemma 6.9), which we state here in a simplified form in order to keep the amount
of background necessary to a minimum.
Theorem A.1. Let $\epsilon > 0$ and let $\mu$ be a balanced pairwise independent probability distribution over
$\{0,1\}^k$ such that $\mu(x) > 0$ for every $x \in \{0,1\}^k$. Then there exist $d, \tau > 0$ such that the following
holds for all $n$.

Let $F_1, \ldots, F_k : \{0,1\}^n \to [0,1]$ be $(d, \tau)$-quasirandom functions. Then

$$\left| \mathop{E}_{w_1, \ldots, w_n} \left[ \prod_{j=1}^{k} F_j(w_{1,j}, \ldots, w_{n,j}) \right] - \prod_{i=1}^{k} E[F_i] \right| \le \epsilon,$$

where $w_1, \ldots, w_n \in \{0,1\}^k$ are drawn independently from $\mu$ and $w_{i,j} \in \{0,1\}$ denotes the $j$th
coordinate of $w_i$.



A.2 Dictatorship Test

We now give the dictatorship test, which by the standard conversion from dictatorship tests to hardness
implies Theorem 3.2. In the dictatorship test, the function $f : [0,1]^k \to [0,1]$ has the same role as the
function $f : \{0,1\}^k \to \mathbb{R}^+$ in Theorem 3.2: as mentioned in the previous section we can take the
unique multilinear extension to make the domain the entire $[0,1]^k$, and the range can be taken to be
$[0,1]$ without loss of generality by simply scaling the function down.
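
To make the multilinear extension concrete, the following sketch computes the coefficients $c_S$ of a function on $\{0,1\}^k$ by Möbius inversion and evaluates the resulting polynomial at a point of $[0,1]^k$. The helper names and the example function `triangle_cut` (the cut value of a triangle, scaled to $[0,1]$) are our own illustrative choices and do not appear in the paper.

```python
from itertools import combinations, product

def multilinear_coeffs(f, k):
    """Return {S: c_S} with f(x) = sum_S c_S * prod_{i in S} x_i on {0,1}^k.

    Moebius inversion: c_S = sum over T subset of S of (-1)^(|S|-|T|) * f(1_T).
    """
    coeffs = {}
    for size in range(k + 1):
        for S in combinations(range(k), size):
            total = 0.0
            for r in range(len(S) + 1):
                for T in combinations(S, r):
                    indicator = tuple(1 if i in T else 0 for i in range(k))
                    total += (-1) ** (len(S) - r) * f(indicator)
            coeffs[S] = total
    return coeffs

def extend(coeffs, x):
    """Evaluate the multilinear polynomial at any point x of [0,1]^k."""
    value = 0.0
    for S, c in coeffs.items():
        term = c
        for i in S:
            term *= x[i]
        value += term
    return value

def triangle_cut(x):
    """Hypothetical example: edges of a triangle cut by {i : x_i = 1}, divided by 2."""
    edges = [(0, 1), (1, 2), (0, 2)]
    return sum(x[i] != x[j] for i, j in edges) / 2.0

coeffs = multilinear_coeffs(triangle_cut, 3)
print(extend(coeffs, (0.5, 0.5, 0.5)))
```

Evaluating the extension at $(p, \ldots, p)$ gives the expectation of the function under the $p$-biased product distribution, which is why the extension is the natural object to feed fractional values of $F$ into.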

Theorem A.2. For every $\epsilon > 0$ there are $d, \tau > 0$ such that the following holds. Let $f : [0,1]^k \to [0,1]$
and let $\mu$ be a balanced pairwise independent distribution over $\{0,1\}^k$. There is a dictatorship test $A$
which, when run on a function $F : \{0,1\}^n \to [0,1]$, has the following properties:

1. $A$ queries $F$ in $k$ positions $x_1, \ldots, x_k \in \{0,1\}^n$ and then accepts with probability $f(F(x_1), \ldots, F(x_k))$.

2. If $F$ is a dictator then $A$ accepts with probability at least $c_\mu(f) - \epsilon$.

3. If $F$ is $(d, \tau)$-quasirandom then $A$ accepts with probability at most $s(f) + \epsilon$.

Proof. Let $\mu'$ be the distribution over $\{0,1\}^k$ defined by

$$\mu' = (1 - \epsilon)\mu + \epsilon U,$$

where $U$ denotes the uniform distribution (in other words, a sample from $\mu'$ is obtained by sampling
from $\mu$ with probability $1 - \epsilon$ and otherwise, with probability $\epsilon$, taking a uniformly random element
of $\{0,1\}^k$). Note that $\mu'$ is also balanced pairwise independent, and more importantly it satisfies
$\mu'(x) > 0$ for all $x \in \{0,1\}^k$, which will allow us to apply Theorem A.1.
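
The two claims about $\mu'$ can be checked mechanically on a small instance. In the sketch below we take "balanced pairwise independent" to mean that every pair of coordinates is uniform on $\{0,1\}^2$ (the usual meaning, but an assumption here since the definition is given elsewhere in the paper), and we use the uniform distribution on even-parity strings of length 3 as a hypothetical $\mu$; all names are illustrative.

```python
from itertools import product

def mix_with_uniform(mu, k, eps):
    """Return mu' = (1 - eps) * mu + eps * U as a dict over all of {0,1}^k."""
    uniform = 1.0 / 2**k
    return {x: (1 - eps) * mu.get(x, 0.0) + eps * uniform
            for x in product((0, 1), repeat=k)}

def is_balanced_pairwise_independent(dist, k, tol=1e-12):
    """Check that every pair of coordinates is uniform on {0,1}^2 (needs k >= 2)."""
    for i in range(k):
        for j in range(i + 1, k):
            for a, b in product((0, 1), repeat=2):
                mass = sum(pr for x, pr in dist.items()
                           if x[i] == a and x[j] == b)
                if abs(mass - 0.25) > tol:
                    return False
    return True

# Hypothetical mu: uniform on the even-parity strings of length 3.
mu = {x: 0.25 for x in [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]}
mu_prime = mix_with_uniform(mu, 3, eps=0.1)

print(is_balanced_pairwise_independent(mu_prime, 3))
print(min(mu_prime.values()) > 0)
```

The example $\mu$ itself assigns probability zero to the odd-parity strings, which is exactly the situation the mixing step is meant to repair.
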
Now the test A is as follows: 

• Pick a random $k$-by-$n$ matrix $X$ over $\{0,1\}$ by letting each column be a sample from $\mu'$,
independently.

• Let $x_1, \ldots, x_k \in \{0,1\}^n$ be the rows of $X$ and let $F(X) = (F(x_1), \ldots, F(x_k)) \in [0,1]^k$ be
the values of $F$ on these $k$ points.

• Accept with probability f(F(X)). 
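
The three steps of the test can be simulated directly. The sketch below is a Monte Carlo illustration under assumptions of our own: $k = 3$, $\mu$ uniform on even-parity strings, and $f$ the multilinear extension of the scaled triangle cut function. Instead of flipping a final coin it averages the acceptance probability $f(F(X))$ over many runs.

```python
import random

K = 3
# Support of the illustrative (hypothetical) distribution mu.
EVEN_PARITY = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

def sample_mu_prime(eps, rng):
    """Sample from mu' = (1 - eps) * mu + eps * U."""
    if rng.random() < eps:
        return tuple(rng.randint(0, 1) for _ in range(K))
    return rng.choice(EVEN_PARITY)

def f(y):
    """Multilinear extension of the scaled triangle cut; maps [0,1]^3 to [0,1]."""
    y0, y1, y2 = y
    return y0 + y1 + y2 - y0 * y1 - y1 * y2 - y0 * y2

def run_test(F, n, eps, rng):
    """One run of the test: returns the acceptance probability f(F(X))."""
    columns = [sample_mu_prime(eps, rng) for _ in range(n)]       # columns of X
    rows = [tuple(col[j] for col in columns) for j in range(K)]   # x_1, ..., x_k
    return f(tuple(F(row) for row in rows))

def estimate_acceptance(F, n, eps, trials, seed=0):
    """Average acceptance probability over independent runs."""
    rng = random.Random(seed)
    return sum(run_test(F, n, eps, rng) for _ in range(trials)) / trials

def dictator(x):
    return x[0]

print(estimate_acceptance(dictator, n=5, eps=0.1, trials=2000))
```

For this particular pair the value $E_{x \sim \mu}[f(x)]$ is $3/4$, so the estimate for a dictator should be close to that.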

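The first property of $A$ is clear from its definition. For the completeness property, note that if $F$ is
a dictator then $F(X) \in \{0,1\}^k$ is just some column of $X$ and therefore distributed according to $\mu'$, so
that

$$E[f(F(X))] = \mathop{E}_{x \sim \mu'}[f(x)] = (1 - \epsilon) \mathop{E}_{x \sim \mu}[f(x)] + \epsilon \mathop{E}_{x \sim U}[f(x)] \ge \mathop{E}_{x \sim \mu}[f(x)] - \epsilon = c_\mu(f) - \epsilon,$$

where the inequality uses that $f$ takes values in $[0,1]$.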

We now turn to the soundness property of $A$. Let $\epsilon' = \epsilon/2^k$ and let $d$ and $\tau$ be given by
Theorem A.1 with parameter $\epsilon'$ and the distribution $\mu'$.

Now consider the multilinear expansion $f(x) = \sum_{S \subseteq [k]} c_S x^S$ of $f$ and let us analyze the expectation
of $f(F(X))$ term by term. If $F$ is $(d, \tau)$-quasirandom then by Theorem A.1 (letting $F_i = F$ for
$i \in S$ and letting $F_i$ be the constant one function for $i \notin S$) we have

$$\left| E\left[ \prod_{i \in S} F(x_i) \right] - \prod_{i \in S} E[F] \right| \le \epsilon'.$$

Let $p = E[F]$ be the bias of the function $F$. Then $\prod_{i \in S} E[F] = p^{|S|}$ equals the expectation of $x^S$
under the $p$-biased distribution. Summing over all $S$ we obtain



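$$E[f(F(X))] \le \sum_{S \subseteq [k]} c_S \mathop{E}_{x}[x^S] + 2^k \epsilon' = \mathop{E}_{x}[f(x)] + \epsilon = s_p(f) + \epsilon \le s(f) + \epsilon,$$

where $x$ is drawn from the $p$-biased distribution on $\{0,1\}^k$, giving the desired soundness property.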
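
The quantities in this bound are straightforward to compute for a small $f$. The sketch below evaluates $s_p(f)$ as an expectation under the $p$-biased distribution and approximates $s(f)$ by maximizing over a grid of values of $p$, which matches how $s(f_4)$ is described earlier (the maximum of $s_p$ over $p$). The grid search is a crude stand-in for a proper one-dimensional optimization, and the example function is again our hypothetical scaled triangle cut.

```python
from itertools import product

def s_p(f, k, p):
    """Expectation of f under the p-biased product distribution on {0,1}^k."""
    total = 0.0
    for x in product((0, 1), repeat=k):
        ones = sum(x)
        total += p**ones * (1 - p) ** (k - ones) * f(x)
    return total

def s_grid(f, k, grid=1001):
    """Approximate s(f) = max_p s_p(f) on an evenly spaced grid; returns (value, p)."""
    return max((s_p(f, k, i / (grid - 1)), i / (grid - 1)) for i in range(grid))

def triangle_cut(x):
    """Hypothetical example: edges of a triangle cut by {i : x_i = 1}, divided by 2."""
    edges = [(0, 1), (1, 2), (0, 2)]
    return sum(x[i] != x[j] for i, j in edges) / 2.0

best_value, best_p = s_grid(triangle_cut, 3)
print(best_value, best_p)
```

For this example $s_p(f) = 3p(1-p)$, so the maximum is attained at $p = 1/2$; for functions such as $f_4$ the maximizing $p$ is not $1/2$, which is why the maximization step matters.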